Sarasota County
- North America > United States > Arizona (0.04)
- North America > United States > Texas (0.04)
- North America > United States > Florida > Sarasota County > Sarasota (0.04)
- Europe > Italy > Sicily > Palermo (0.04)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.92)
Nonnegative Matrix Factorization through Cone Collapse
Nguyen, Manh, Pimentel-Alarcón, Daniel
Nonnegative matrix factorization (NMF) is a widely used tool for learning parts-based, low-dimensional representations of nonnegative data, with applications in vision, text, and bioinformatics. In clustering applications, orthogonal NMF (ONMF) variants further impose (approximate) orthogonality on the representation matrix so that its rows behave like soft cluster indicators. Existing algorithms, however, are typically derived from optimization viewpoints and do not explicitly exploit the conic geometry induced by NMF: data points lie in a convex cone whose extreme rays encode fundamental directions or "topics". In this work we revisit NMF from this geometric perspective and propose Cone Collapse, an algorithm that starts from the full nonnegative orthant and iteratively shrinks it toward the minimal cone generated by the data. We prove that, under mild assumptions on the data, Cone Collapse terminates in finitely many steps and recovers the minimal generating cone of $\mathbf{X}^\top$. Building on this basis, we then derive a cone-aware orthogonal NMF model (CC-NMF) by applying uni-orthogonal NMF to the recovered extreme rays. Across 16 benchmark gene-expression, text, and image datasets, CC-NMF consistently matches or outperforms strong NMF baselines (including multiplicative updates, ANLS, projective NMF, ONMF, and sparse NMF) in terms of clustering purity. These results demonstrate that explicitly recovering the data cone can yield both theoretically grounded and empirically strong NMF-based clustering methods.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (4 more...)
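The abstract above reports results in terms of clustering purity. As a rough illustration of that metric only (not of Cone Collapse or CC-NMF), here is a minimal sketch using the standard definition: each cluster is credited with its most frequent ground-truth label, and the credited counts are divided by the number of points. The function name `clustering_purity` and the toy labels are assumptions made for this example.

```python
from collections import Counter


def clustering_purity(true_labels, cluster_ids):
    """Fraction of points whose label matches their cluster's majority label."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have the same length")
    if not true_labels:
        raise ValueError("at least one data point is required")

    # Count how often each ground-truth label occurs inside each cluster.
    counts_by_cluster = {}
    for label, cluster in zip(true_labels, cluster_ids):
        counts_by_cluster.setdefault(cluster, Counter())[label] += 1

    # Each cluster contributes the size of its majority label.
    majority_total = sum(max(c.values()) for c in counts_by_cluster.values())
    return majority_total / len(true_labels)


# Toy data (hypothetical): six points, three true classes, two clusters.
toy_labels = ["a", "a", "a", "b", "b", "c"]
toy_clusters = [0, 0, 1, 1, 1, 1]
toy_purity = clustering_purity(toy_labels, toy_clusters)
```

In the toy data, cluster 0 contributes 2 points and cluster 1 contributes 2 points, so the purity is 4/6. Purity never decreases when clusters are split further, which is why it is normally compared at a fixed number of clusters.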
Explainable RL Policies by Distilling to Locally-Specialized Linear Policies with Voronoi State Partitioning
Deproost, Senne, Steckelmacher, Dennis, Nowé, Ann
Deep Reinforcement Learning is one of the state-of-the-art methods for producing near-optimal system controllers. However, deep RL algorithms train a deep neural network that lacks transparency, which poses challenges when the controller has to meet regulations or foster trust. To alleviate this, one could transfer the learned behaviour into a model that is human-readable by design using knowledge distillation. Often this is done with a single model, which mimics the original model on average but could struggle in more dynamic situations. A key challenge is that this simpler model should strike the right balance between flexibility and complexity, or between bias and accuracy. We propose a new model-agnostic method to divide the state space into regions in which a simplified, human-understandable model can operate. In this paper, we use Voronoi partitioning to find regions where linear models can achieve similar performance to the original controller. We evaluate our approach on a gridworld environment and a classic control task. We observe that our proposed distillation to locally-specialized linear models produces policies that are explainable, and show that the distilled policies match or even slightly outperform the black-box policy they are distilled from.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Belgium > Flanders (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- (10 more...)
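To make the idea of locally-specialized linear policies over a Voronoi partition concrete, here is a minimal sketch of the lookup step only: a state is assigned to the region of its nearest center, and that region's linear model produces the action. It is not the authors' distillation procedure; the function names, the centers, and the weights are hypothetical values chosen for illustration.

```python
def nearest_center(state, centers):
    """Index of the Voronoi cell containing `state` (nearest center, Euclidean)."""
    def squared_distance(center):
        return sum((s - c) ** 2 for s, c in zip(state, center))

    return min(range(len(centers)), key=lambda i: squared_distance(centers[i]))


def linear_action(state, weights, bias):
    """Evaluate action = W @ state + b with plain Python lists."""
    return [
        sum(w * s for w, s in zip(row, state)) + b
        for row, b in zip(weights, bias)
    ]


def voronoi_policy(state, centers, models):
    """Pick the region for `state`, then apply that region's linear model."""
    region = nearest_center(state, centers)
    weights, bias = models[region]
    return region, linear_action(state, weights, bias)


# Hypothetical setup: two regions in a 2-D state space, one action dimension.
centers = [(0.0, 0.0), (10.0, 10.0)]
models = [
    ([[1.0, 0.0]], [0.0]),   # region 0: action = first state component
    ([[0.0, -1.0]], [2.0]),  # region 1: action = 2 - second state component
]

region_a, action_a = voronoi_policy((1.0, 2.0), centers, models)
region_b, action_b = voronoi_policy((9.0, 8.0), centers, models)
```

Because each region holds a single weight matrix and bias, the behaviour inside a cell can be read directly from those coefficients, which is the kind of explainability the abstract is after.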
PrefixNLI: Detecting Factual Inconsistencies as Soon as They Arise
Harary, Sapir, Hirsch, Eran, Slobodkin, Aviv, Wan, David, Bansal, Mohit, Dagan, Ido
Natural Language Inference (NLI) models have been used in various ways to improve the factuality of LLM outputs. This is typically done by applying an NLI model to judge whether the model output is entailed by the supposed evidence, triggering corrective actions such as beam reranking at inference time or RL rewards during training. While NLI models are trained to detect factual inconsistencies over complete sentences, decisions in the common autoregressive generation architecture are made for each evolving text prefix during decoding. Addressing this setting, we generalize the entailment detection task to apply over arbitrary text prefixes, and suggest its utility for improving generation faithfulness. Providing suitable evaluation and training datasets for this task, we train MiniTruePrefixes, a novel specialized model that better detects factual inconsistencies over text prefixes, outperforming comparable baseline NLI models by 5-14 F1 points in prefix-level entailment. We further demonstrate that integrating MiniTruePrefixes into a controlled decoding framework substantially improves factual consistency in abstractive summarization. When guided by MiniTruePrefixes, LLaMA-3.2-3B-Instruct matches the faithfulness and runtime of the 8B model from the same model family, while using only half the memory.
- Europe > Romania (0.04)
- Europe > Netherlands (0.04)
- South America > Argentina (0.04)
- (16 more...)
Scenes From Saturday's Nationwide 'No Kings' Protests
Organizers say the "No Kings" protests drew more than 7 million people across 2,700 cities. The crowds included high-profile politicians, A-list celebrities, and more than a few creative inflatables. On Saturday, crowds gathered in cities across the United States to protest President Donald Trump and his administration. Organizers of the No Kings rallies claim that more than 7 million people attended in all, across 2,700 cities in the United States and beyond. The gatherings provided a clear picture not only of how widespread the resistance to the Trump administration has become, but also of the diversity of the coalition driving it.
- North America > United States > California > Los Angeles County > Los Angeles (0.18)
- North America > United States > Florida > Sarasota County > Venice (0.17)
- North America > United States > Oregon > Multnomah County > Portland (0.16)
- (19 more...)
FactAppeal: Identifying Epistemic Factual Appeals in News Media
Mor-Lan, Guy, Sheafer, Tamir, Shenhav, Shaul R.
How is a factual claim made credible? We propose the novel task of Epistemic Appeal Identification, which identifies whether and how factual statements have been anchored by external sources or evidence. To advance research on this task, we present FactAppeal, a manually annotated dataset of 3,226 English-language news sentences. Unlike prior resources that focus solely on claim detection and verification, FactAppeal identifies the nuanced epistemic structures and evidentiary basis that underlie and support these claims. FactAppeal contains span-level annotations which identify factual statements and mentions of the sources on which they rely. Moreover, the annotations include fine-grained characteristics of factual appeals such as the type of source (e.g. Active Participant, Witness, Expert, Direct Evidence), whether it is mentioned by name, mentions of the source's role and epistemic credentials, attribution to the source via direct or indirect quotation, and other features. We model the task with a range of encoder models and generative decoder models in the 2B-9B parameter range. Our best performing model, based on Gemma 2 9B, achieves a macro-F1 score of 0.73.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Florida > Sarasota County > Sarasota (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- (6 more...)
- Government (1.00)
- Media (0.94)
- Law (0.68)
Omni-Embed-Nemotron: A Unified Multimodal Retrieval Model for Text, Image, Audio, and Video
Xu, Mengyao, Zhou, Wenfei, Babakhin, Yauhen, Moreira, Gabriel, Ak, Ronay, Osmulski, Radek, Liu, Bo, Oldridge, Even, Schifferer, Benedikt
We present Omni-Embed-Nemotron, a unified multimodal retrieval embedding model developed to handle the increasing complexity of real-world information needs. While Retrieval-Augmented Generation (RAG) has significantly advanced language models by incorporating external knowledge, existing text-based retrievers rely on clean, structured input and struggle with the visually and semantically rich content found in real-world documents such as PDFs, slides, or videos. Recent work such as ColPali has shown that preserving document layout using image-based representations can improve retrieval quality. Building on this, and inspired by the capabilities of recent multimodal models such as Qwen2.5-Omni, we extend retrieval beyond text and images to also support audio and video modalities. Omni-Embed-Nemotron enables both cross-modal (e.g., text → video) and joint-modal (e.g., text → video+audio) retrieval using a single model. We describe the architecture, training setup, and evaluation results of Omni-Embed-Nemotron, and demonstrate its effectiveness in text, image, and video retrieval.
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.05)
- South America > Brazil > São Paulo (0.04)
- Oceania > Australia > Queensland > Brisbane (0.04)
- (7 more...)
RAG Security and Privacy: Formalizing the Threat Model and Attack Surface
Arzanipour, Atousa, Behnia, Rouzbeh, Ebrahimi, Reza, Dutta, Kaushik
Retrieval-Augmented Generation (RAG) is an emerging approach in natural language processing that combines large language models (LLMs) with external document retrieval to produce more accurate and grounded responses. While RAG has shown strong potential in reducing hallucinations and improving factual consistency, it also introduces new privacy and security challenges that differ from those faced by traditional LLMs. Existing research has demonstrated that LLMs can leak sensitive information through training data memorization or adversarial prompts, and RAG systems inherit many of these vulnerabilities. At the same time, RAG's reliance on an external knowledge base opens new attack surfaces, including the potential for leaking information about the presence or content of retrieved documents, or for injecting malicious content to manipulate model behavior. Despite these risks, there is currently no formal framework that defines the threat landscape for RAG systems. In this paper, we address a critical gap in the literature by proposing, to the best of our knowledge, the first formal threat model for RAG systems. We introduce a structured taxonomy of adversary types based on their access to model components and data, and we formally define key threat vectors such as document-level membership inference and data poisoning, which pose serious privacy and integrity risks in real-world deployments. By establishing formal definitions and attack models, our work lays the foundation for a more rigorous and principled understanding of privacy and security in RAG systems.
- Europe > Austria > Vienna (0.14)
- North America > United States > Florida > Hillsborough County > Tampa (0.05)
- North America > United States > Florida > Sarasota County > Sarasota (0.04)
- (2 more...)
PBiLoss: Popularity-Aware Regularization to Improve Fairness in Graph-Based Recommender Systems
Naeimi, Mohammad, Chehreghani, Mostafa Haghir
Recommender systems, especially those based on graph neural networks (GNNs), have achieved remarkable success in capturing user-item interaction patterns. However, they remain susceptible to popularity bias (the tendency to over-recommend popular items), which reduces content diversity and compromises fairness. In this paper, we propose PBiLoss, a novel regularization-based loss function designed to explicitly counteract popularity bias in graph-based recommender models. PBiLoss augments traditional training objectives by penalizing the model's inclination toward popular items, thereby encouraging the recommendation of less popular but potentially more personalized content. We introduce two sampling strategies, Popular Positive (PopPos) and Popular Negative (PopNeg), which respectively modulate the contribution of the positive and negative popular items during training. We further explore two methods to distinguish popular items: one based on a fixed popularity threshold and another without any threshold, making the approach flexible and adaptive. Our proposed method is model-agnostic and can be seamlessly integrated into state-of-the-art graph-based frameworks such as LightGCN and its variants. Comprehensive experiments across multiple real-world datasets show that PBiLoss significantly improves fairness, as measured by reductions in the Popularity-Rank Correlation for Users (PRU) and Popularity-Rank Correlation for Items (PRI), while maintaining or even enhancing standard recommendation accuracy and ranking metrics. These results highlight the effectiveness of directly embedding fairness objectives into the optimization process, providing a practical and scalable solution for balancing accuracy and equitable content exposure in modern recommender systems.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- (22 more...)
- Research Report > New Finding (0.67)
- Overview > Innovation (0.46)